Efficient lexical retrieval for English text-to-speech synthesis
Abstract
We present a first version of a filter dictionary for use in a computer-telephony text-to-speech synthesis system. The aim was to provide a lexicon that is compact and fast, with broader coverage than the standard dictionary used to create it. A transcription was deemed accurate only if both its phonemic transcription and its lexical stress assignment were correct. The approach guarantees 100% accurate coverage of the original dictionary and also gives 93% accurate transcription of the expected coverage of novel words. Lexical stress and the phonemic transcription are retrieved in one pass, resulting in an extremely fast system. User-defined entries are also allowed, so that accuracy is retained for non-standard transcriptions. The algorithm was developed for British English but could be applied to other languages.
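The abstract does not describe the data structures behind the filter dictionary, so the following is only a minimal Python sketch of the general idea of one-pass retrieval, not the authors' algorithm. It assumes that stress marks are stored inline with the phonemes, so a single lookup returns both, and that user-defined entries take priority over the base lexicon. The class name `Lexicon`, the method names, the optional `fallback` hook for novel words, and the SAMPA-like example transcriptions are all hypothetical.

```python
from typing import Callable, Dict, Optional


class Lexicon:
    """Toy lexicon: one lookup yields phonemes and stress together (illustrative only)."""

    def __init__(
        self,
        base: Dict[str, str],
        fallback: Optional[Callable[[str], Optional[str]]] = None,
    ) -> None:
        # Keys are lower-cased so lookup is case-insensitive.
        self._base = {word.lower(): trans for word, trans in base.items()}
        self._user: Dict[str, str] = {}
        # Hypothetical hook for words found in neither dictionary.
        self._fallback = fallback

    def add_user_entry(self, word: str, transcription: str) -> None:
        """Register a user-defined transcription that overrides the base lexicon."""
        self._user[word.lower()] = transcription

    def lookup(self, word: str) -> Optional[str]:
        """Return the stress-marked transcription, or None if nothing applies."""
        key = word.lower()
        if key in self._user:        # user definitions win
            return self._user[key]
        if key in self._base:        # then the base lexicon
            return self._base[key]
        if self._fallback is not None:
            return self._fallback(key)
        return None


# Hypothetical entries; the apostrophe marks the primary-stressed syllable.
lexicon = Lexicon({"system": "'s I s t @ m", "record": "'r E k O: d"})
lexicon.add_user_entry("record", "r I 'k O: d")
```

Because the stress mark travels with the phoneme string, no second pass is needed to assign stress after the transcription has been found, which is the property the abstract highlights.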
Similar resources
L2 Learners’ Lexical Inferencing: Perceptual Learning Style Preferences, Strategy Use, Density of Text, and Parts of Speech as Possible Predictors
This study was intended first to categorize the L2 learners in terms of their learning style preferences and second to investigate if their learning preferences are related to lexical inferencing. Moreover, strategies used for lexical inferencing and text related issues of text density and parts of speech were studied to determine their moderating effects and the best predictors of lexical infe...
A database design for a TTS synthesis system using lexical diphones
Database designs, if based on the premise that there are about 2000 diphones in English, as stated in many publications and on-line documents, are likely to render a database of diphones, which will fail to capture some important phonological phenomena of English. This paper proposes a TTS database, which is built from diphones inclusive of their syllabic stress; we term these units lexical dip...
ProPOSEC: A Prosody and PoS Annotated Spoken English Corpus
We have previously reported on ProPOSEL, a purpose-built Prosody and PoS English Lexicon compatible with the Python Natural Language ToolKit. ProPOSEC is a new corpus research resource built using this lexicon, intended for distribution with the Aix-MARSEC dataset. ProPOSEC comprises multi-level parallel annotations, juxtaposing prosodic and syntactic information from different versions of the ...
The Relationship between Syntactic and Lexical Complexity in Speech Monologues of EFL Learners
This study aims to explore the relationship between syntactic and lexical complexity and also the relationship between different aspects of lexical complexity. To this end, speech monologs of 35 Iranian high-intermediate learners of English on three different tasks (i.e. argumentation, description, and narration) were analyzed for correlations between one measure of sy...
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...